Using Point-set Compression to Classify Folk Songs

Author

  • David Meredith
Abstract

Thirteen different compression algorithms were used to calculate the normalized compression distances (NCDs) between pairs of tunes in the Annotated Corpus of 360 Dutch folk songs from the collection Onder de groene linde. These NCDs were then used in conjunction with the 1-nearest-neighbour algorithm and leave-one-out cross-validation to classify the 360 melodies into tune families. The classifications produced by the algorithms were compared with a ground-truth classification prepared by expert musicologists. Twelve of the thirteen compressors used in the experiment were based on the discovery of translational equivalence classes (TECs) of maximal translatable patterns (MTPs) in point-set representations of the melodies. The twelve algorithms consisted of four variants of each of three basic algorithms: COSIATEC, SIATECCOMPRESS and Forth’s algorithm. The main difference between these algorithms is that COSIATEC strictly partitions the input point set into TEC covered sets, whereas the TEC covered sets in the output of SIATECCOMPRESS and Forth’s algorithm may share points. The general-purpose compressor, bzip2, was used as a baseline against which the point-set compression algorithms were compared. The highest classification success rate, 77–84%, was achieved by COSIATEC, followed by 60–64% for Forth’s algorithm and then 52–58% for SIATECCOMPRESS. When the NCDs were calculated using bzip2, the success rate was only 12.5%. The results demonstrate that the effectiveness of NCD for measuring similarity between folk songs for classification purposes is highly dependent upon the actual compressor chosen. Furthermore, it seems that compressors based on finding maximal repeated patterns in point-set representations of music show more promise for NCD-based music classification than general-purpose compressors designed for compressing text strings.
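The evaluation pipeline described in the abstract (NCD between pairs of tunes, then 1-nearest-neighbour classification under leave-one-out cross-validation) can be sketched roughly as follows. This is only an illustrative sketch, not the author's implementation. It assumes the commonly used NCD definition, NCD(x, y) = (C(xy) − min(C(x), C(y))) / max(C(x), C(y)), where C gives the compressed size. It uses Python's standard `bz2` module as a stand-in for the bzip2 baseline, and it assumes each melody has already been serialized to a byte string by some means the abstract does not specify. The function names are hypothetical.

```python
import bz2
from typing import Callable, Sequence


def compressed_size(data: bytes) -> int:
    """Size in bytes of the bzip2-compressed input (stand-in for C(x))."""
    return len(bz2.compress(data))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance, using the commonly cited formula.

    Assumption: concatenation of the two byte strings plays the role of
    the combined object "xy".
    """
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


def leave_one_out_1nn(
    items: Sequence,
    labels: Sequence,
    distance: Callable = ncd,
) -> float:
    """Fraction of items whose nearest other item carries the same label.

    Each item is held out in turn and classified by the label of its
    nearest neighbour among all remaining items.
    """
    n = len(items)
    correct = 0
    for i in range(n):
        nearest = min(
            (j for j in range(n) if j != i),
            key=lambda j: distance(items[i], items[j]),
        )
        if labels[nearest] == labels[i]:
            correct += 1
    return correct / n
```

Swapping `distance` for an NCD built on a different compressor is the only change needed to compare compressors in this setup, which is the comparison the abstract reports.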

Related papers

Bundeli Folk-Song Genre Classification with kNN and SVM

While large data-dependent techniques have made advances in between-genre classification, the identification of subtypes within a genre has largely been overlooked. In this paper, we approach automatic classification of within-genre Bundeli folk music into its subgenres: Gaari, Rai and Phag. Bundeli, which is a dominant dialect spoken in a large belt of Uttar Pradesh and Madhya Pradesh, has a ri...

Calculating Similarity of Folk Song Variants with Melody-based Features

As folk songs live largely through oral transmission, there usually is no standard form of a song; each performance of a folk song may be unique. Different interpretations of the same song are called song variants, and all variants of a song belong to the same variant type. In the paper, we explore how various melody-based features relate to folk song variants. Specifically, we explore whether we ca...

Classification of Music Signals in the Visual Domain

With the huge increase in the availability of digital music, it has become more important to automate the task of querying a database of musical pieces. At the same time, a computational solution of this task might give us an insight into how humans perceive and classify music. In this paper, we discuss our attempts to classify music into three broad categories: rock, classical and jazz. We dis...

Melodic similarity among folk songs: An annotation study on similarity-based categorization in music

In this article we determine the role of different musical features for the human categorization of folk songs into tune families in a large collection of Dutch folk songs. Through an annotation study we investigate the relation between musical features, perceived similarity and human categorization in music. We introduce a newly developed annotation method which is used to create an annotation...

A Factored Language Model of Quantized Pitch and Duration

This paper investigates a novel statistical approach to music classification that utilizes recent technology developed in the domain of natural language processing. Specifically, we investigate the use of factored language models (FLMs) for the task of producing conditional probability distributions to model origin-specific folk songs. In our model, pitch cluster and quantized duration are empl...

Publication date: 2014